------------------------------
PDS IMG Reading Utility (v2.1)
------------------------------
(revised 2010 Sep 16)

-------
Purpose
-------

This Windows executable, written in the Delphi 6 dialect of Pascal, can be used to explore most IMG format data available on NASA's PDS (Planetary Data System).  The program can automatically assess the range of data values present in the image matrix and convert individual spectral bands of the file either to 8-bit Windows BMP format for visualization, or to an ASCII text copy of the original data.  In BMP output, counts of 0-255 are scaled to correspond to a selectable range of data values, with or without a gamma correction, and an optional thumbnail preview is available.  Because the data is loaded and examined only a single data value at a time, images larger than the memory capacity of the host computer can be analyzed and converted on disk.  However, since the data is accessed as a linear array of samples with a 32-bit integer index, the program is, at present, limited to reading the first 2147483647 samples.


-------------
Sample Output
-------------

Examples of grayscale output produced by this program (converted to JPG) can be seen at:

  http://ltvt.wikispaces.com/Additional+Textures


------------------------------
Sources of Data and an Example
------------------------------

NASA's PDS has many "nodes" containing archives of data in IMG format.

IMG data consists of a machine-readable matrix of intensity values, each cell containing a data value of some fixed bit length.  The values are sometimes integers and sometimes real ("floating point") numbers, and are stored sometimes "most significant byte first" ("Motorola" format) and sometimes "least significant byte first" ("Intel" format).

The example image described below ("M107699540CC.IMG") can be found at:

http://lroc.sese.asu.edu/data/LRO-L-LROC-3-CDR-V1.0/LROLRC_0001/DATA/MAP/2009259/WAC/M107699540CC.IMG


The details of the format can often be found in an ASCII text header embedded at the start of the IMG file, but sometimes it is in a separate "label" file that has the same name as the *.IMG, but with a *.LBL extension.

In the case of embedded labels, the number of bytes that have to be skipped over to get to the image data is specified in the embedded label.

A screenshot of the processing of the above-mentioned IMG example is included in the present folder.  The numbers entered in the program input boxes were derived based on the following embedded label information (as displayed with the =Read ASCII Header= button):

RECORD_BYTES                   = 704
FILE_RECORDS                   = 84879
LABEL_RECORDS                  = 15
...
OBJECT = IMAGE
  LINES                 = 21216
  LINE_SAMPLES          = 704
  SAMPLE_BITS           = 32
  SAMPLE_TYPE           = PC_REAL
  VALID_MINIMUM         = 16#FF7FFFFA#
  NULL                  = 16#FF7FFFFB#
...


The number of "Header bytes" is computed from LABEL_RECORDS*RECORD_BYTES = 10560.

The number of "Samples per line" is copied from LINE_SAMPLES.

The number of "Bits per sample" is copied from SAMPLE_BITS.

PC_REAL indicates that the data is in real (floating point) format and that byte reversal is not needed on a PC.
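
The derivation of the input box values from the label can be sketched in Python.  This is only an illustration and not the utility's own code: the helper name parse_label and its simple KEYWORD = VALUE parsing are assumptions, and real PDS labels contain more syntax (units, multi-line values, nested objects) than is handled here.

```python
# Label excerpt copied from the example above.
LABEL_TEXT = """\
RECORD_BYTES                   = 704
FILE_RECORDS                   = 84879
LABEL_RECORDS                  = 15
OBJECT = IMAGE
  LINES                 = 21216
  LINE_SAMPLES          = 704
  SAMPLE_BITS           = 32
  SAMPLE_TYPE           = PC_REAL
"""

def parse_label(text):
    """Hypothetical helper: return a dict of keyword -> value string."""
    items = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            items[key.strip()] = value.strip()
    return items

label = parse_label(LABEL_TEXT)

# "Header bytes" = LABEL_RECORDS * RECORD_BYTES
header_bytes = int(label["LABEL_RECORDS"]) * int(label["RECORD_BYTES"])
samples_per_line = int(label["LINE_SAMPLES"])
bits_per_sample = int(label["SAMPLE_BITS"])

print(header_bytes)      # 10560
print(samples_per_line)  # 704
print(bits_per_sample)   # 32
```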

The non-data value of -3.4028227E+038 was determined by reading Line 1, Sample 1 (the first pixel in the image) with the =Retrieve Point= button and corresponds to the hexadecimal value listed as NULL.  These values are ignored in preparing the histogram and set to 0 intensity when converting the IMG to BMP.
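
The correspondence between the hexadecimal NULL in the label and the displayed number can be checked with a few lines of Python.  This sketch assumes that the 16#...# notation gives the four bytes of an IEEE 754 single precision value, most significant byte first, which is consistent with the value quoted above.

```python
import struct

# NULL = 16#FF7FFFFB# : interpret the hex digits as a big-endian 32-bit real.
null_value = struct.unpack(">f", bytes.fromhex("FF7FFFFB"))[0]
valid_min = struct.unpack(">f", bytes.fromhex("FF7FFFFA"))[0]

print(f"{null_value:.7E}")     # -3.4028227E+38
print(null_value < valid_min)  # True: NULL lies just below VALID_MINIMUM

# In a PC_REAL file the same four bytes sit on disk in the opposite order.
on_disk = bytes.fromhex("FF7FFFFB")[::-1]
print(struct.unpack("<f", on_disk)[0] == null_value)  # True
```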

The "Min data" and "Max data" boxes were automatically filled in using the =Histogram= button, testing the full range of image lines and samples.

Clicking =Convert File= produced a 14,937,142 byte BMP image consisting of 256 grayscale levels, and stripped of the embedded label (but adding a new header that is a required part of the BMP format).
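
The quoted file size can be reproduced with the following sketch.  It assumes the usual layout of an 8-bit BMP (a 14-byte file header, a 40-byte information header, a 256-entry palette of 4 bytes each, and rows padded to a multiple of 4 bytes); the function name is made up for this illustration.

```python
def bmp8_file_size(width, height):
    """Size in bytes of an 8-bit paletted BMP under the assumed layout."""
    row_bytes = (width + 3) // 4 * 4   # each row padded to a multiple of 4
    header_bytes = 14 + 40 + 256 * 4   # file header + info header + palette
    return header_bytes + row_bytes * height

# LINE_SAMPLES = 704, LINES = 21216 in the example label
print(bmp8_file_size(704, 21216))  # 14937142
```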



--------------------
Function of Controls
--------------------
(in the following  =xxx= indicates a button visible on the screen)

********************************
**** Format definition area ****
********************************

=Input File=

Click this button to select the downloaded IMG file you want to explore.  After selecting the file, the input boxes should be set as explained in the preceding example. The most critical settings are the number of "Header bytes", the number of "Samples per line", the "Bits per sample", and, for multi-spectral data, the number of "Bands".  These items, labeled in red, are necessary to correctly locate the data cell for a given Line, Sample, and Band. 


**** Data format ****

A series of radio buttons is provided for further defining the format of the numeric data in the cells.  The formats that are supported are:

  Bits per sample   Formats
  ---------------   -------------------
         8           Unsigned integer
        16           Unsigned integer, Signed integer
        32           Unsigned integer, Signed integer, Real ("Single precision")
        64           Real ("Double precision")

If the currently selected format radio button is incompatible with the specified "Bits per sample" a compatible one will be automatically selected before any data is retrieved.

The "Reverse byte order" checkbox switches from "Little Endian" (the default for files written on Windows PC's) to "Big Endian" (the byte order of numbers written on certain other computer types).  The data will generally be unintelligible unless the correct byte order is specified.
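
The effect of the format and byte order settings can be illustrated with Python's struct module.  This is a sketch of the idea rather than the program's Pascal code; the table and function names are invented here.

```python
import struct

# struct codes for the supported (bits per sample, format) combinations
FORMAT_CODES = {
    (8, "unsigned"): "B",
    (16, "unsigned"): "H", (16, "signed"): "h",
    (32, "unsigned"): "I", (32, "signed"): "i", (32, "real"): "f",
    (64, "real"): "d",
}

def decode_sample(raw, bits, kind, reverse_byte_order=False):
    """Decode one sample; little endian unless reverse_byte_order is set."""
    prefix = ">" if reverse_byte_order else "<"
    return struct.unpack(prefix + FORMAT_CODES[(bits, kind)], raw)[0]

raw = bytes([0x01, 0x02])
print(decode_sample(raw, 16, "unsigned"))        # 513 (0x0201)
print(decode_sample(raw, 16, "unsigned", True))  # 258 (0x0102)
```

The same two bytes give different numbers depending on the assumed byte order, which is why a wrong setting usually produces unintelligible data.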


**** Band structure definition ****

Many images on the PDS are actually compilations of images of the same area taken at a number of wavelengths ("Bands"), or equivalently sets of data values (not necessarily image intensity) referring to the same matrix of positions.

The PDS IMG Reading Utility can read the default forms of the three most popular schemes for encoding multi-spectral data: "Band Interleaved by Line" (BIL), "Band Interleaved by Pixel" (BIP), and "Band Sequential" (BSQ, for which the number of lines in the image for a single band must be specified).

A nice explanation of the differences between these formats, using as an illustration a typical color image with three bands (red, green and blue), can be found at:

  http://help.arcgis.com/en/arcgisdesktop/10.0/help/009t/009t00000010000000.htm

See the appendix on Multi-spectral Data Formats, below, for more information including an explanation of the additional input boxes, normally set to "0", available for use when the data includes padding bytes.


********************************
**** Data testing/retrieval ****
********************************

=Retrieve Point=

Click this button to read and display the numeric image data in the specified cell of the image matrix. Following the PDS convention, the Line and Sample numbers run from 1 (rather than 0) to the height and width of the image. All the input boxes on the line above must be properly set to access the correct cell. 


*********************************************************
**** Area of interest definition/Histogram functions ****
*********************************************************

=Reset=

Sets the =Histogram= input boxes to the expected full range of the data area based on the current input boxes.  "Start Line", "Start Sample" and "Origin Sample" are set to 1. "End Sample" is set to "Samples per Line". For all data formats other than BSQ, "End Line" is automatically set according to the following algorithm:

  End Line = (Data bytes)/((Number of bands)*(Samples per image line)*(Bytes per sample))

where "Data bytes" is the total file size on disk (in bytes) minus the specified "Header bytes".  A warning is issued if the result is not an integral number of image lines.
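
The =Reset= calculation can be sketched as follows.  The function name is an assumption, and the file size used in the example is taken to be FILE_RECORDS*RECORD_BYTES from the label of the LROC example rather than measured on disk.

```python
def end_line(file_size, header_bytes, bands, samples_per_line, bits_per_sample):
    """Return (End Line, is_whole_number_of_lines) for non-BSQ data."""
    data_bytes = file_size - header_bytes
    bytes_per_line = bands * samples_per_line * (bits_per_sample // 8)
    lines, remainder = divmod(data_bytes, bytes_per_line)
    return lines, remainder == 0

file_size = 84879 * 704   # assumed: FILE_RECORDS * RECORD_BYTES
print(end_line(file_size, 10560, 1, 704, 32))  # (21216, True)
```

The result agrees with the LINES value in the label; a False in the second position corresponds to the case in which the program issues its warning.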

=Histogram=

Clicking this button displays the minimum and maximum data values found in the rectangular section of the image defined by the Start and End boxes.  If "List Points" is checked, each individual value will be displayed. The "Ignore non-data value" check box allows a particular number (the "NULL" value) to be ignored in determining the minimum and maximum (this value is typically copied, by a cut and paste operation, from the extreme histogram result obtained *without* the "Ignore non-data value" option).  The "Min data" and "Max data" input boxes, which are used in converting the raw data to grayscale, can be automatically set by clicking =Histogram=.
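
The effect of the "Ignore non-data value" option on the reported range can be sketched like this.  The function name and the sample values are invented for the illustration; only the NULL pattern comes from the LROC example.

```python
import struct

def data_range(values, null_value=None):
    """Return (min, max) of values, skipping null_value if one is given."""
    lo = hi = None
    for v in values:
        if null_value is not None and v == null_value:
            continue
        if lo is None or v < lo:
            lo = v
        if hi is None or v > hi:
            hi = v
    return lo, hi

NULL = struct.unpack(">f", bytes.fromhex("FF7FFFFB"))[0]
samples = [NULL, 0.08, 0.12, NULL, 0.10]   # made-up data around 0.1

print(data_range(samples)[0] == NULL)  # True: NULL shows up as the minimum
print(data_range(samples, NULL))       # (0.08, 0.12)
```

Running first without the option exposes the NULL value as the extreme result, which can then be supplied as the value to ignore.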


*******************************************
**** File preview/conversion functions ****
*******************************************

=Preview=

Displays, in a pop-up window, a thumbnail grayscale image of the current selection. As for =Convert File=, the portion of the raw data displayed is determined by the Line and Sample limits (and Band, if applicable) specified for the histogram.  The "Size" box to the right of this button determines the maximum size of the thumbnail (in pixels) in its longest dimension. The =Preview= image is generated by retrieving the raw data only at the locations corresponding to the pixels in the Preview image (no matter how large the "Size" box is set, the =Preview= size will never exceed the dimensions of the original image).

The "Auto" checkbox causes the black and white display levels in the =Preview= image to be automatically set to the minimum and maximum data values encountered at the sampling points (ignoring the non-data value and applying the "Gamma" correction, if requested) so that the full range is visible. This allows large disk files to be rapidly previewed with a minimum of disk access.

Otherwise (if "Auto" is *not* checked) the grayscale intensities in the =Preview= image will be controlled by the current values in the "Min data", "Max data" and "Gamma" boxes, which should generally be set by first running the =Histogram= function over the full data set of interest. The =Preview= should give an accurate impression of the BMP result that would be produced by =Convert File= with the current settings.

=Convert File=
Converts the section of the image specified in the Histogram input boxes to BMP grayscale format or an ASCII text file, and saves it to disk.

In the BMP mode, output levels of 0-255 are selected to correspond to the range of raw data values specified in the "Min data" and "Max data" boxes.  The conversion is linear unless a Gamma other than 1 is specified. If the "Ignore non-data value" box is checked, cells containing the NULL value will be set to the intensity (0) corresponding to the "Min data" value.  Note that a very peculiar result will likely be obtained if the "Min data" or "Max data" box contains the NULL value, since the output will be scaled to represent this as a possible valid data value.

Note: the present Gamma correction is the reciprocal of the gamma often used for display purposes, for example in LTVT.  That is, Gamma = 0.5 here boosts shadow intensities in the same way that an LTVT display gamma of 1/0.5 = 2.0 would (when applied to a file converted without the correction here).
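
A minimal sketch of a scaling with these properties is given below.  The exact formula and rounding used by the program are not stated here, so the power-law form, the clamping of out-of-range values, and the function name are all assumptions chosen to match the description (linear for Gamma = 1, shadows boosted for Gamma = 0.5, NULL cells set to 0).

```python
def to_gray(value, min_data, max_data, gamma=1.0, null_value=None):
    """Map a raw data value to a 0-255 level (assumed power-law form)."""
    if null_value is not None and value == null_value:
        return 0                          # same level as "Min data"
    t = (value - min_data) / (max_data - min_data)
    t = min(max(t, 0.0), 1.0)             # assumed: clamp outside the range
    return int(round(255 * t ** gamma))

print(to_gray(0.25, 0.0, 1.0))             # 64  (linear)
print(to_gray(0.25, 0.0, 1.0, gamma=0.5))  # 128 (shadow boosted)
```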

In the Text mode, the image cell contents are output as ASCII text in either integer or scientific notation (8 or depending on the original format.  Each element, except the last, is followed by a TAB character.  Scaling and gamma correction are *not* applied when TXT format is requested; however, if the "Ignore non-data value" box is checked, cells containing the NULL value will be replaced with the current contents of the "Min data" input box.

The "Origin sample" input box can be used to alter the horizontal "wrapping" of the converted image (or text) without affecting the width. If "Origin sample" has a value between "Start sample" and "End sample", then instead of consisting of the samples from "Start sample" to "End sample", the output consists of the samples from "Origin sample" to "End sample" followed by those from "Start sample" up to "Origin sample" (since the width is unchanged, the wrapped portion must end just before "Origin sample"). This can be used, for example, to transform a data set that spans a longitude range of 0 to 360 into one spanning -180 to +180.  To do this, assuming Sample 1 = 0 and Sample NNNN = 360, where NNNN is an even integer, the desired effect is achieved by setting Origin sample = 1 + (NNNN/2), that is, to the midpoint sample in the image lines.
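
The re-ordering can be sketched as follows.  The function name is invented, and two details are assumptions: that the wrapped portion stops just before "Origin sample" (so that the width is unchanged), and that an "Origin sample" outside the Start/End range leaves the order untouched.

```python
def wrapped_sample_order(start, end, origin):
    """Return the 1-based sample numbers in the order they are output."""
    if not (start <= origin <= end):
        return list(range(start, end + 1))   # assumed: no wrapping
    return list(range(origin, end + 1)) + list(range(start, origin))

# NNNN = 8 samples, Origin sample = 1 + 8/2 = 5
print(wrapped_sample_order(1, 8, 5))  # [5, 6, 7, 8, 1, 2, 3, 4]
```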

=Abort=
Halts processing.  This button is visible during the =Histogram= and =Convert File= functions.


************************************************
**** Memo area and Header reading functions ****
************************************************

The memo box is used for displaying messages and requested data.

=Read ASCII Header=
Reads lines from the start of the IMG file and displays them in the memo box as ASCII text. ASCII lines are defined as a series of bytes (each assumed to represent a character) followed by the Carriage Return (CR) - Line Feed (LF) sequence of two bytes. The "Max line Length" and/or "Max lines to read" input boxes may be set to non-zero values, in which case reading halts after the first non-compliant line is found; otherwise reading continues to the end of the file or until a line containing only the characters "END" is encountered (valid PDS headers are supposed to terminate with this sequence).  If the display is terminated by a line exceeding "Max line Length", the content of that line will be displayed followed by a series of asterisks (****).
Note that not all IMG files have an embedded ASCII header.  In those cases the =Read ASCII Header= button will generate gibberish and one should look for a separate *.LBL text file, containing the data description, in the PDS archive.
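
The reading rule described above can be sketched in Python.  This is an approximation under stated assumptions, not a transcription of the program: the function name is invented, a limit of 0 is taken to mean "no limit", and a line is treated as the terminator if it equals "END" after surrounding blanks are stripped.

```python
import io

def read_ascii_header(stream, max_line_length=0, max_lines=0):
    """Collect CR-LF terminated lines until END, end of file, or a limit."""
    lines = []
    while not (max_lines and len(lines) >= max_lines):
        raw = bytearray()
        too_long = False
        while not raw.endswith(b"\r\n"):
            byte = stream.read(1)
            if not byte:                       # end of file
                break
            raw += byte
            if max_line_length and len(raw.rstrip(b"\r\n")) > max_line_length:
                too_long = True
                break
        if not raw:
            break
        text = raw.decode("latin-1").rstrip("\r\n")
        if too_long:
            lines.append(text + "****")        # mark the over-long line
            break
        lines.append(text)
        if text.strip() == "END":
            break
    return lines

demo = io.BytesIO(b"RECORD_BYTES = 704\r\nLABEL_RECORDS = 15\r\nEND\r\n\x00\x01\x02")
print(read_ascii_header(demo))
# ['RECORD_BYTES = 704', 'LABEL_RECORDS = 15', 'END']
```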

=Clear=

Clears the memo box.


-------------------------------------
Appendix: Multi-spectral Data Formats
-------------------------------------

As indicated above, the PDS IMG Reading Utility can read the default forms of the three schemes for encoding multi-spectral data found on the PDS: "Band Interleaved by Line" (BIL), "Band Interleaved by Pixel" (BIP), and "Band Sequential" (BSQ).

The total storage space is always the same and the data is always stored in a single continuous linear sequence of samples, but the sequence differs.  For the RGB example, the layouts (with somewhat arbitrary carriage returns) look like:

BIL:

  RRRRRRRRRRRRRRR.... <-- line 1 in red
  GGGGGGGGGGGGGGG.... <-- line 1 in green
  BBBBBBBBBBBBBBB.... <-- line 1 in blue
  RRRRRRRRRRRRRRR.... <-- line 2 in red
  GGGGGGGGGGGGGGG....
  BBBBBBBBBBBBBBB....
  ...............
  ...............

BIP:

  RGBRGBRGBRGBRGB...  <-- line 1 data
  RGBRGBRGBRGBRGB...  <-- line 2 data
  RGBRGBRGBRGBRGB...
  ...............
  ...............

BSQ:

  RRRRRRRRRRRRRRR...  <-- complete image in red
  RRRRRRRRRRRRRRR...
  ...............
  ...............
  GGGGGGGGGGGGGGG...  <-- complete image in green
  GGGGGGGGGGGGGGG...
  ...............
  ...............
  BBBBBBBBBBBBBBB...  <-- complete image in blue
  BBBBBBBBBBBBBBB...
  ...............
  ...............

if:

    NS = number of samples per image line
    NL = number of image lines
    NB = number of bands

then:

   total storage space (bytes) = NS*NL*NB*NumBytesPerSample

and if  B, L, S are the desired  Band, Line and Sample  in a one-based index system (first band, line and sample = 1), then the sample number N in the linear array is:

BIL:   N = NS*NB*(L - 1) + NS*(B - 1) + S
BIP:   N = NS*NB*(L - 1) + NB*(S - 1) + B
BSQ:   N = NS*NL*(B - 1) + NS*(L - 1) + S

Note that in the case of data consisting of a single band (NB = 1, and hence B = 1) these all reduce to:

       N = NS*(L - 1) + S

which says that in the first line (L = 1), the linear index equals the sample number (N = S).  The second line is the same, except you skip over NS samples to get to its start, and so on.  The formulas when B>1 have the same basic structure, but have additional larger units that you skip over.
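
The three formulas translate directly into code.  The sketch below uses the same symbols as above; the function names, and the conversion to a byte offset for unpadded data, are additions for illustration.

```python
def linear_index(layout, B, L, S, NS, NB, NL=None):
    """One-based sample number N for band B, line L, sample S."""
    if layout == "BIL":
        return NS * NB * (L - 1) + NS * (B - 1) + S
    if layout == "BIP":
        return NS * NB * (L - 1) + NB * (S - 1) + B
    if layout == "BSQ":
        if NL is None:
            raise ValueError("BSQ needs NL, the number of lines per band")
        return NS * NL * (B - 1) + NS * (L - 1) + S
    raise ValueError("unknown layout: " + layout)

def byte_offset(N, header_bytes, bytes_per_sample):
    """Zero-based file position of sample N when there is no padding."""
    return header_bytes + (N - 1) * bytes_per_sample

# 3 bands, 2 lines of 4 samples; second band, line 2, sample 3
print(linear_index("BIL", 2, 2, 3, NS=4, NB=3))        # 19
print(linear_index("BIP", 2, 2, 3, NS=4, NB=3))        # 20
print(linear_index("BSQ", 2, 2, 3, NS=4, NB=3, NL=2))  # 15
```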

Each of these forms can come with complicating variations. For example, standard Windows BMP files consist of rows of monochrome or RGB 8-bit (1 byte) intensity values in BIP format -- however each such row is padded, if necessary, to make its total length a multiple of 4 bytes.  Similarly, data in BSQ format may have extra bytes ("gaps") inserted between the successive images, and BIL data may have extra bytes at the end of each band row. The following input boxes are available to deal with these possible variations:

"Lines" = number of lines in the image for a single band (needed to decode BSQ format)
"Band pad" = extra bytes at end of band in BIL format
"Line pad" = extra bytes at end of image line in all formats
"Gap" = extra bytes between BSQ images

If a monochrome image requires any of these adjustments, it can be handled as any of the banded formats with "Bands" = 1.
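
As an illustration of how such padding shifts the data, here is a sketch for the BSQ case only.  How the program actually combines the boxes is not spelled out above, so the interpretation used here ("Line pad" bytes after every image line, "Gap" bytes after every complete single-band image) and the function name are assumptions.

```python
def bsq_byte_offset(B, L, S, NS, NL, bytes_per_sample,
                    header_bytes=0, line_pad=0, gap=0):
    """Zero-based file position of band B, line L, sample S (all one-based)."""
    line_bytes = NS * bytes_per_sample + line_pad   # one padded image line
    band_bytes = NL * line_bytes + gap              # one image plus its gap
    return (header_bytes
            + (B - 1) * band_bytes
            + (L - 1) * line_bytes
            + (S - 1) * bytes_per_sample)

# Without padding this matches (N - 1) * bytes per sample with N = 15
print(bsq_byte_offset(2, 2, 3, NS=4, NL=2, bytes_per_sample=2))  # 28
# With 2 pad bytes per line and a 10-byte gap between images
print(bsq_byte_offset(2, 2, 3, NS=4, NL=2, bytes_per_sample=2,
                      line_pad=2, gap=10))                       # 44
```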


-------------------------------------- 
Example of Reading Multi-spectral Data 
-------------------------------------- 

The present folder includes a screenshot of reading BIL data from the Moon Mineralogy Mapper (M3) instrument on Chandrayaan-1:

  http://the-moon.wikispaces.com/Chandrayaan-1

The example file processed is one of 51 lines with 85 bands (each 304 samples long) in each line, with no header.  The file can be found at:

  http://pds-imaging.jpl.nasa.gov/data/m3/CH1M3_0001/DATA/20081118_20090214/200901/L1B/

Specifically, the image ("M3G20090106T113234_V01_RDN.IMG") file is:

  http://pds-imaging.jpl.nasa.gov/data/m3/CH1M3_0001/DATA/20081118_20090214/200901/L1B/M3G20090106T113234_V01_RDN.IMG

and the format information in the associated text file "M3G20090106T113234_V01_L1B.LBL":

  http://pds-imaging.jpl.nasa.gov/data/m3/CH1M3_0001/DATA/20081118_20090214/200901/L1B/M3G20090106T113234_V01_L1B.LBL


When the BIL radio button is checked, input boxes appear for specifying the total number of bands and the particular band number you want to interrogate.  The =Retrieve Point=, =Histogram= and =Convert File= functions will then retrieve data for that band *only* and will ignore the data in the other bands.  The Histogram line and sample limits should be set to the expected size of the output image for a single band.

When the BSQ radio button is checked an additional input box will appear in which you must specify the number of lines present in a complete image (for a single band).  As indicated by the formula given above, this information is necessary for calculating the size of the complete image chunks that need to be skipped over to get to successive bands.


-----------
Source Code
-----------

The zipped distribution file includes a subfolder labeled "Delphi Source Code".  Within it are most of the files from which the Windows executable is generated.  Delphi 6 is a "Rapid Application Development" compiler that was once distributed for free.  See:

http://en.wikipedia.org/wiki/Embarcadero_Delphi

The Pascal instructions for doing the processing are in the file "*.pas", but they call other "units" which can be found in the LTVT source code.  The remaining files in the "Delphi Source Code" folder define the graphic interface (the locations of "controls" on the Windows screen) and compiler options.  They may or may not be compatible with more recent versions of Delphi.


--------------------
Alternative Software
--------------------

PDS IMG files can also be opened as "RAW" data by a number of photo processing applications.  

1. The (US) National Institutes of Health's ImageJ:

    http://rsbweb.nih.gov/ij/

is a good, free, and widely available example.  The present folder shows a screenshot of the parameters needed to open the LROC example file mentioned above in ImageJ.  Although the matrix of numbers is correctly loaded, the display has a highly posterized, black-and-white appearance because the cells with the NULL intensity of -3.402822655E+038 are treated as valid image data.  This establishes the black level, so the real data (all at intensities around 0.1) is displayed as white.

2. Bjorn Jonsson's IMG2PNG:

    http://www.mmedia.is/~bjj/utils/img2png/index.html

also permits most PDS IMG formats to be converted to 8 or 16-bit PNG image files.

3. NASA's NASAView Image Display Software:

    http://pds.nasa.gov/tools/nasa-view.shtml

permits screen display of most PDS images but saving of the decoded data is reportedly limited to 8-bit compressions (JPG or GIF).


--------------------
Contact/Distribution
--------------------

The present utility was written by Jim Mosher (jimmosher@yahoo.com). 

The executable and source code are freely available on the LTVT website at: 

  http://ltvt.wikispaces.com/Utility+Programs#PDS_IMG_Reader








